Quid Pro Quo: A Mechanism for Fair Collaboration in Networked Systems*



Agustin Santos 
Institute IMDEA Networks, Madrid, Spain 
E-mail: agustin.santos@imdea.org 



Antonio Fernandez Anta 
Institute IMDEA Networks, Madrid, Spain 
E-mail: antonio.fernandez@imdea.org



Luis Lopez Fernandez 
LADyR, GSyC 
Universidad Rey Juan Carlos, Madrid, Spain 
E-mail: llopez@gsyc.es 



Abstract 



Collaboration may be understood as the execution of coordinated tasks (in the most general sense) by 
groups of users, who cooperate for achieving a common goal. Collaboration is a fundamental assumption and 
requirement for the correct operation of many communication systems. The main challenge when creating 
collaborative systems in a decentralized manner is dealing with the fact that users may behave in selfish ways, 
trying to obtain the benefits of the tasks but without participating in their execution. In this context, Game 
Theory has been instrumental to model collaborative systems and the task allocation problem, and to design 
mechanisms for optimal allocation of tasks. In this paper, we revise the classical assumptions and propose a 
new approach to this problem. First, we establish a system model based on heterogeneous nodes (users, players),
and propose a basic distributed mechanism so that, when a new task appears, it is assigned to the most suitable 
node. The classical technique for compensating a node that executes a task is the use of payments (which in 
most networks are hard or impossible to implement). Instead, we propose a distributed mechanism for the 
optimal allocation of tasks without payments. We prove this mechanism to be robust even in the presence of
independent selfish or rationally limited players. Additionally, our model is based on very weak assumptions,
which makes the proposed mechanisms suitable for implementation in networked systems (e.g., the Internet).

1 Introduction 

Selfish behavior is becoming a subject of great concern and practical importance to network designers [1]. Game
Theory is the preferred approach for designing communication systems with (potentially) selfish entities.
This has led to the proposal of a number of interesting protocols and mechanisms for networks based
on Game Theory concepts [2, 3]. However, in the study of networks under conventional models, a collection of
simplifying assumptions is typically made. For instance, it is assumed that selfish users are rational, that they are
homogeneous, that they can compute a Nash equilibrium, that their utility function is known, etc. However, there
are many systems in which these assumptions are not very realistic.

In this paper we revisit the study of communication systems with selfish users (or players), reevaluating and 
relaxing the above-mentioned common assumptions. In particular, we propose the problem of analyzing and
designing a fair collaborative system under a very weak set of game-theoretic assumptions. In this general
context, we propose mechanisms to be used to implement this collaborative system with provable properties, like
the fairness of the system and the truthfulness of its users. The mechanisms proposed can be applied to such varied 
technologies as social and crowd computing, Web 2.0, P2P, opportunistic networks, and cloud technologies. 

As mentioned, we abstract the problem to be solved as the fair execution of tasks in a decentralized collaborative 
system. The main challenge when creating collaborative systems in a decentralized manner is dealing with the fact 
that system nodes may behave in selfish ways, trying to obtain the benefits of the tasks but without participating in 
their execution. (This is the realm of Game Theory, which has been instrumental to model collaborative systems 
and the task allocation problem, and to design mechanisms for optimal allocation of tasks.) We assume that all 
nodes have an interest in having the tasks done. However, establishing fair mechanisms for sharing the generated
workload is not straightforward. (E.g., in current P2P systems, a small fraction of peers usually assumes most of the
required effort, and this causes reduced performance, lack of reliability, low incentive to participate for fair users, 
etc.) It would therefore be desirable that each node could take the responsibility of the execution of a balanced 
fraction of the tasks. 

The objective is to establish some kind of protocol to share the task execution costs. For this, we need to 
consider the concept of ability or opportunity of execution. Let us assume that each node has some capacity for 
timely execution of a given task. This capacity may vary over time and with the type of task. For example, at a 
given time, a node may have free bandwidth but have full utilization of its CPU, while its situation could be the 
opposite at another time. Hence, at a particular moment, a node may have greater ability to perform tasks involving
communication, while at a later time its situation may change to prefer tasks more intensive in CPU computation. 

This opportunity or ability is related to the notion of task execution cost. In other words, we define the cost as
some kind of metric measuring the capability of executing a particular task at a given time. Hence, the cost varies
from one task to another (even when the task is the same, but at a later time). In Game Theory, closely related to
cost, there is the notion of utility. We define the utility as the cost savings associated with work not done. Hence,
given that all nodes are interested in the execution of the tasks, a node gets more utility whenever it avoids running
tasks by letting other nodes do them.

Clearly, when trying to formalize a model based on these notions, a number of problems arise. First, a node's
costs are only known by the node itself. For external entities it would be difficult to audit or check if a given 
particular node has more or less CPU capacity. In Game Theory, this concept is called private information. For 
obtaining the private information of a node, the basic mechanism is to directly ask for it and expect the node to 
declare its value correctly. 

For us, each node is a computing node that belongs to a user who can alter her node's behavior for her own 
benefit (i.e., may declare false costs trying to avoid the execution of tasks). Whenever this happens, we claim that 
the user acts in a selfish way. This selfishness is one of the factors that may distort the internal workings of a 
distributed application. The loss of system performance produced by selfish nodes is a parameter to consider and 
it is called price of anarchy HE). 

Therefore, the problem we face consists of designing a system capable of assigning tasks to nodes so that all 
the tasks are executed, and the total cost incurred is minimal. When the behavior of nodes is guaranteed to be fair, 
this is just a simple optimization exercise. However, when nodes may choose whether to be selfish the problem 
becomes much more complex. In this paper we propose an algorithm that, based on game-theoretic principles,
solves this problem. We have called this algorithm the Quid Pro Quo Mechanism (QPQ). The name comes from
a Latin expression commonly used by lawyers and which may be translated as "This for that" or "A thing for
another". This expression is often used when someone does a job and expects an equivalent compensation
in exchange. We use this expression since it reflects the spirit of the algorithm: due to the lack of payments in
our model, the nodes work for others with the hope that others will work for them in the future.

1.1 State of the Art



As described above, the problem addressed in this paper is the allocation of task executions to potentially selfish 
users. This problem has been extensively studied in the literature. One important related work was carried out 
by Rosenschein et al. where they define a "Task Oriented Domain". Even though they obtain fairly relevant 
conclusions, they do not shed any light on the specific problem considered here, since their model makes strong 
assumptions, such as knowledge of the task costs or a bargaining power over time. Recently, the use of game theory 
to model selfish behavior in the design of distributed systems has been proposed. Some works have appeared using 
mechanism design, a branch of mathematics derived from game theory, which provides the required background
for the study and design of distributed systems under the action of selfish nodes (see, e.g., [7, 8, 9]).

In this direction, our QPQ algorithm is similar to the mechanism proposed by Jackson et al. [10]. In that work,
they present a new interesting type of mechanism (called "linking mechanism") which, instead of offering incentives
or payments to players, limits the spectrum of players' responses to a probability distribution known by the game 
designer. In that paper the authors proved that a linking mechanism is valid when the players' possible decisions 
are distributed following discrete probabilities. Additionally, the authors show that a linking mechanism can also 
be used for repeated games. Even though the work of Jackson et al. is very relevant to the problem we consider, 
it does not offer a method for the construction of mechanisms when the game is based on unknown continuous 
probability distributions, as assumed here. A second work that explores the idea of linking mechanisms is due to
Ferenc [11]. In that paper, he proposes a mechanism which limits player responses by restricting the first two
moments (mean and variance) of the probability distribution, where that distribution is known to the designer. Both
works reflect the main idea behind the concept of linking mechanism: when a game consists of multiple instances 
of the same basic decision problem (e.g., saying yes or no, choosing among a number of discrete options), it 
is possible to define selfishness-resistant algorithms by restricting the players' responses to a given distribution. 
Hence, in that case, the frequency with which a player declares a particular decision is known beforehand. 

In the specific areas of computing and communications, it is important to remark that most mechanisms
proposed for dealing with selfish agents make unrealistic assumptions [12]. In this direction, Bauer et al. [13] criticize
many of these hypotheses, reviewing well-known works [14], [15], [16] to show that they are not applicable in real
environments. Specifically, they identify two common strong artificial assumptions:

1. The assumption that the designer of the algorithms has some knowledge about the preferences of the nodes.

2. The assumption that the interaction among players is limited to a single round (while it is well known in the 
literature that a solution for a single round does not necessarily apply when the game is repeated). 

1.2 Contributions 

In this paper, we face the problem of task allocation relaxing these (and other) common hypotheses, so that the 
obtained results can be applied in real environments. Hence, the contributions of this paper are twofold. First, to 
the best of our knowledge, this is the first work proposing a linking mechanism solution without prior knowledge 
of the distribution of the players' decisions, and without a payment system among them. Second, we generalize 
and improve previous works in the area to provide algorithms which are suitable for application in the context
of repeated task execution allocation in real communication and computing systems, even in the presence of selfish 
or non-rational users. 

As we previously claimed, we do not want to restrict our mechanism to a set of unrealistic hypotheses. In- 
stead, we establish a number of requirements that our model must satisfy. These requirements should provide the 
appropriate flexibility to guarantee the applicability of our results in real environments. 

Abstract utility metrics. We assume, as an abstract notion, that the cost of executing a task for a node¹ depends
on its interest on the task, its opportunity or ability to execute it, or its degree of willingness to cooperate. We need 
to accept that each node may measure this parameter in its very own metric and units. Hence, for example, a node 
may decide on the cost of a task according to the occupation of its CPU, but another one may prefer to make it 
depend on its available bandwidth. In a real scenario, the number of factors that can influence the execution cost 
of a task can be extremely large. In this direction, our model must enable each node to define, in a flexible way,
how costs (and utilities) are measured. 

No payment system. Payments are, in their most basic interpretation, a way of exchanging costs. Many existing
mechanisms base their incentive schemes on the existence of payments. For payments to be possible, it is necessary
that all players manage a common currency reference (euro, dollar, etc.). However, given our previous requirement,
it is not clear how we can find that shared currency reference in our model. If a node measures its costs in terms
of, for example, reputation, it can hardly "pay" another node that measures its costs in CPU units. Hence, in
our work, we assume that payments are not possible.

Players' rationality. In game theory, most of the existing algorithms require players to be perfectly rational.
This means that a player, using the available information, should always be capable of selecting the best strategy
(the one that maximizes her utility). However, this is a controversial hypothesis which has received much criticism.
Accepting this assumption means that players are capable of mathematically calculating all alternatives, which 
in some cases requires solving complex (NP-hard) problems. Clearly, this is not always feasible for all players. 
Hence, we commit ourselves to proposing mechanisms suitable for finding quasi-optimal task allocation, even in 
the presence of rationally limited players. 

Incentive to participate. In relation to players' rationality, even in the case in which we are able to find a global
quasi-optimal task allocation, it is possible that the behavior of rationally limited users may harm the benefit of
other players. In this direction, we add a stronger requirement: all nodes must have an incentive to participate in
the game, regardless of whether they are rational or not.

No central entity. A final requirement we impose is the capability of the system to work without the existence
of any kind of central entity. This means that the proposed mechanisms must be implementable
following completely distributed schemes.

1.3 Structure 

The rest of the paper is structured as follows. In Section [2] we provide a formal definition of the problem and 
define basic terminology. In Section [3] we present a basic linking mechanism, and evaluate the issues that need to 
be faced to make it suitable for our problem. In Section [4] we present the QPQ mechanism, and formally prove its 
properties. In Section [5] we describe how QPQ could be used in real environments. Finally, Section [6] concludes 
the paper. 

2 Definitions 

To establish a formal framework for the problem, let us provide some definitions. 
Definition 2.1 (Problem). The problem of the assignment of tasks is a tuple $(T, N, C)$ where:

1. $T = \{t_1, t_2, \ldots\}$ is the (not necessarily finite) set of tasks that are issued to the system over time ($t_k$ is
the task issued at time step $k$). We assume tasks to be atomic, independent, and of fixed duration a. (For
simplicity we will assume a = 1, i.e., each task takes one time step to be executed.) Note that we assume
that complex tasks may be divided into atomic tasks.

2. $N = \{1, 2, \ldots, n\}$ is an ordered list of nodes or players, where $N$ is assumed to be finite.

3. $(C_i(t))_{i \in N}$ is a vector of costs (or utilities) where $C_i(t)$ is the cost of executing task $t \in T$ by node $i$. This
information is private (only known by node $i$).

1 We will use the terms user, player, and node interchangeably in the rest of the paper.

It is important to remark some aspects of the above definitions. First, we assume that the set of tasks is not 
known beforehand. Tasks appear one by one in a sequence of time steps, which command our discrete time 
evolution. Hence, the arrival of a new task dictates the start of a new round of our repeated game. We assume that
tasks are independent of each other and that the execution of a task does not influence the cost of the subsequent
ones. Moreover, we require that each task be completely executed by the time the next task is issued. For
simplicity, we assume that the mechanisms to coordinate the allocation of the tasks take negligible time (with 
respect to the time step). Finally, we assume that every node that is assigned a task by the allocation mechanism 
actually executes the task. 

Hence, as tasks are issued, each node $i \in N$ estimates a sequence of costs $c_i(t_1), c_i(t_2), \ldots, c_i(t_k), \ldots$, which
we assume to be independent samples of a probability distribution $\sigma_i \in \Delta(S_i)$ characterizing node $i$'s behavior. In
this context, we denote by $S_i$ the distribution support (i.e., the range of values for which the probability is different
from zero) and by $\Delta(S_i)$ the set of all possible probability distributions over $S_i$. From now on, we will consider that
$C_i$ is a real-valued random variable with probability distribution $\sigma_i \in \Delta(S_i)$. To simplify the notation, we denote
realizations of this random variable by $c_i(t) = C_i(t)$, $t \in T$. When clear from the context, we may remove the
task $t$ from the notation $c_i(t)$, writing simply $c_i$.

Given that all players enjoy the result of any task executed in the system, we can define the utility of a player 
as the savings obtained by not executing some tasks (i.e. the benefit obtained from participating in the cooperative 
computing scheme and not doing all the work by itself). That is, the utility $U_i(t)$ of node $i$ corresponding to a
given task $t$ is given by

$$U_i(t) = \begin{cases} c_i(t) & \text{if task } t \text{ is not executed by node } i, \\ 0 & \text{otherwise,} \end{cases} \qquad (2.1)$$

and the total utility of node $i$ is $U_i = \sum_{t \in T} U_i(t)$.²

We define $U_i$ as the random variable associated to the total utility of node $i$. In a similar way, we denote by $W_i$
the real-valued random variable associated to the cost actually executed by player $i$, and by $W_i(t)$ its concrete realization
for task $t$. Note that each task could be executed or not by a particular player. Hence,

$$E[C_i] = E[U_i + W_i] = E[U_i] + E[W_i]. \qquad (2.2)$$

Finally, we assume that communication between players is reliable and concurrent. In particular, in the
mechanisms we propose all players exchange their values $c_i(t)$. We assume that these values are correctly received
by the players in a time that is negligible with respect to the time step (hence the reliability property).
Additionally, we assume that each player sends its value before receiving the value of any of the other players (hence the
concurrence property).

2 In game theory it is common to add a discount factor ($\delta$) in time. We have assumed it to be equal to $\delta = 1$.

3 Basic Linking Mechanism



As mentioned above, a linking mechanism is applicable to repeated games where the decision (also known as
message) of players is restricted to a particular known set. In our problem, the decision is the cost $c_i(t)$ of the task.
With this concept in mind, let us define our first algorithmic attempt to solve the problem by applying a linking
mechanism, presented in Algorithm 1.

Algorithm 1 Simple linking mechanism (code for node $i$ and a generic task $t$, which is omitted from the notation)

    Estimate and publish the cost $c_i$ of the task
    Wait to receive the costs $c_j$ from the other players
    for all $j \in N$ do
        if not Accepted($c_j$, Historic_j) then
            $c_j \leftarrow \mathrm{Random}(c_{-j})$
        end if
        Historic_j $\leftarrow$ Historic_j $\cup \{c_j\}$
    end for
    $d \leftarrow \arg\min_{j \in N} c_j$
    if $d = i$ then
        execute the task
    else
        do nothing (node $d$ will execute the task)
    end if



As can be observed, for each task, each player estimates the cost of computing the task and publishes it.
Publication means broadcasting a message with the cost to all players (although any other means of distribution,
like shared memory, can be used). By assumption, a player sends its cost before it receives any of the others
(concurrency, which implies that the published costs do not depend on each other), and all of the costs are correctly
received at each player (reliability). Then, the algorithm assigns the task to the player that publishes the lowest cost.
If players publish their real costs, the total utility is maximized. However, this kind of approach could
drive selfish users to publish fake costs in order to avoid executing tasks. For this reason, we add an acceptance
test. When a published cost is not considered acceptable, the system generates a random value for the cost of
that node in the round. The implementation of this acceptance test will be discussed later; however, it is important
to remark that it contains the linking part of the mechanism (it depends on the historical values published by
that particular node). Just as an example, if we mandate that nodes must publish costs
between 0 and 1 following a uniform distribution, then we could consider unacceptable values deviating from that
distribution. It is also important to note that all nodes use the same acceptance test with the same history. Hence,
they all accept or all reject. If players reject a value $c_j$, the value $\mathrm{Random}(c_{-j})$ is in fact
generated deterministically from the set of values $c_{-j} = \bigcup_{k \neq j} \{c_k\}$, so that all players regenerate the same value
for $j$.
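
One round of this simple linking mechanism can be sketched in Python as follows. This is only an illustration under stated assumptions: the names `deterministic_random`, `run_round` and `accepted` are hypothetical, the use of SHA-256 to derive the replacement value is our own choice (the text only requires that all players regenerate the same value), and the replacement is derived from the values as originally published by the other players.

```python
import hashlib
import struct


def deterministic_random(other_costs):
    """Map the costs published by the other players to a value in [0, 1].

    Every node hashes the same inputs, so every node derives the same value.
    SHA-256 is an illustrative assumption, not something the text prescribes.
    """
    payload = b"".join(struct.pack(">d", c) for c in other_costs)
    digest = hashlib.sha256(payload).digest()
    return int.from_bytes(digest[:8], "big") / 2**64


def run_round(costs, history, accepted):
    """One round of the simple linking mechanism, as computed by any node.

    costs:    published costs, indexed by player.
    history:  per-player lists with the values kept in previous rounds.
    accepted: hypothetical acceptance test, (cost, past_values) -> bool.
    Returns the index of the player that must execute the task.
    """
    published = list(costs)
    final = list(costs)
    for j, cost in enumerate(published):
        if not accepted(cost, history[j]):
            # Replace the rejected value using only the other players' values.
            others = published[:j] + published[j + 1:]
            final[j] = deterministic_random(others)
        history[j].append(final[j])
    # The task goes to the player with the lowest (possibly replaced) cost.
    return min(range(len(final)), key=final.__getitem__)


def in_unit_interval(cost, past_values):
    """Toy acceptance test: only the range of the value is checked."""
    return 0.0 <= cost <= 1.0


if __name__ == "__main__":
    history = [[], [], []]
    print(run_round([0.4, 0.2, 0.9], history, in_unit_interval))
```

Because every node runs the same function on the same published values and the same history, all nodes reach the same allocation without any coordinator.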

Algorithm 1 has the objective of providing intuition on how we build our mechanism, but it clearly has several
issues that contradict our previously stated requirements. In particular, fair allocation is not guaranteed. For
instance, there is no way of defining a notion of fairness within this algorithm, given that costs may have
different meanings for different players. Additionally, given that costs are abstract notions, we cannot have any 
a-priori information on the shape of their corresponding distributions. So, it is not clear how to implement the 
acceptance test. 

Digging into these problems, it is easy to understand that one of their causes is the fact that, given our require- 
ments, each player has the right of measuring her costs on her preferred metric. (Hence, each player may have 
different distributions with different supports.) For this reason, cost comparisons cannot be easily made.
Additionally, there is a second aspect that must be addressed. In the literature about linking mechanisms, authors assume
that instances of the game (rounds) are simultaneous in time. In this case, defining the acceptance function over 
the set of values is easier. However, in our case, tasks are issued, and hence players generate their costs, over time. 
Then, from the point of view of the designer, it is not clear how to determine the acceptance of a value by compar- 
ing with a certain probability distribution. The issue is even worse given the fact that this distribution is not known 
by the designer. To solve all these problems we propose a novel solution based on applying a transformation over 
the utility function. 

Utility normalization. Given that the utility is defined as the work not done by a node, we may use as utility
function of a node its probability distribution of costs. Once this is done, we may modify Algorithm 1 and
normalize players' utilities so that they may be compared with each other. To normalize we use a transformation
called Probability Integral Transformation (PIT). Our idea is to use the known fact that evaluating a cumulative
distribution function at its own random variable yields a uniformly distributed value [17]. More formally, the PIT is defined as follows.

Definition 3.1 (Probability Integral Transformation). Let X be a continuous random variable with a Cumulative 
Distribution Function (CDF) F; that is X ~ F. Then, the probability integral transformation defines a new 
random variable Y as: Y = F(X). 

As mentioned above, our interest in the PIT is due to the following lemma. 

Lemma 3.1 (PIT follows uniform distribution). Let X be a continuous random variable with CDF F. Then F(X)
follows a uniform distribution on the interval [0, 1]. That is, the random variable Y defined by the probability integral
transformation Y = F(X) is uniformly distributed on the unit interval.
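
The lemma can be checked numerically with a short Python sketch. The exponential cost distribution, its rate, the sample size and the function names are assumptions made only for this illustration: costs are sampled, passed through their own CDF, and the fraction of transformed values below a few thresholds is compared with the thresholds themselves.

```python
import math
import random


def exponential_cdf(x, rate):
    """CDF of an exponential distribution with the given rate."""
    return 1.0 - math.exp(-rate * x)


def pit_samples(n, rate, seed=0):
    """Draw n exponential costs and return their PIT-normalized values."""
    rng = random.Random(seed)
    return [exponential_cdf(rng.expovariate(rate), rate) for _ in range(n)]


def fraction_below(values, threshold):
    """Empirical probability that a value is at most the threshold."""
    return sum(v <= threshold for v in values) / len(values)


ys = pit_samples(20000, rate=0.1)

if __name__ == "__main__":
    # For a uniform variable on [0, 1], Pr(Y <= q) should be close to q.
    for q in (0.1, 0.5, 0.9):
        print(q, round(fraction_below(ys, q), 3))
```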

Note that X does not need to be a continuous random variable. In the case that the player's costs follow
a discrete distribution, it is still possible to perform a similar transformation called the Generalized Distributional
Transform [18], whose properties are equivalent to those of the PIT.

Definition 3.2 (Generalized Distributional Transform). Let $X$ be a random variable (not necessarily continuous)
with cumulative distribution function $F$ and let $V \sim U(0, 1)$ be a random variable with uniform distribution
in [0, 1] independent of $X$. The modified distribution function $F(x, \lambda)$ is defined as

$$F(x, \lambda) = \Pr(X < x) + \lambda \Pr(X = x).$$

From this, we can define the generalized distributional transform of $X$ as $Y = F(X, V)$, which can be proved to be
uniformly distributed on the unit interval.
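
As an illustration of Definition 3.2, the following Python sketch applies the transform to a small discrete cost distribution. The probability mass function `pmf`, the sample size and the function names are hypothetical choices for this example.

```python
import random


def generalized_transform(x, v, pmf):
    """F(x, v) = Pr(X < x) + v * Pr(X = x) for a discrete pmf given as a dict."""
    below = sum(p for value, p in pmf.items() if value < x)
    return below + v * pmf.get(x, 0.0)


def sample_transformed(n, pmf, seed=0):
    """Draw n values of X and independent V ~ U(0, 1), and return F(X, V)."""
    rng = random.Random(seed)
    values = list(pmf)
    weights = [pmf[value] for value in values]
    out = []
    for _ in range(n):
        x = rng.choices(values, weights=weights)[0]
        v = rng.random()
        out.append(generalized_transform(x, v, pmf))
    return out


# Hypothetical discrete cost distribution: cost 1, 2 or 5.
pmf = {1: 0.5, 2: 0.3, 5: 0.2}
ys = sample_transformed(20000, pmf)

if __name__ == "__main__":
    for q in (0.25, 0.5, 0.75):
        share = sum(y <= q for y in ys) / len(ys)
        print(q, round(share, 3))
```

The auxiliary variable V spreads each atom of the discrete distribution over an interval whose length equals the probability of that atom, which is why the result covers the unit interval without gaps.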

Proofs of these properties can be found in [18]. Many studies in economics use this definition and its properties,
such as [19] or [20]. In our case, to simplify the notation, we refer to both transformations simply as PIT, regardless of
whether the base distribution is continuous or discrete.

Coming back to Algorithm 1, our idea is to modify it by applying the PIT on the players' declared costs. Hence,
instead of publishing the values from its real probability distribution, a player must publish the normalized ones,
so that the new algorithm chooses for execution the player minimizing the normalized cost values instead of the
original costs. Fig. 1 illustrates this process.

Based on these arguments, it is clear that the PIT provides a mechanism for comparing (normalized) node
costs. However, we may wonder if the proposed transformation is valid, in the sense that it may not preserve the
preferences of the player. To solve this issue, it suffices to notice that what we are doing is changing the space
of preferences. Therefore, the PIT somehow means that, instead of asking the user "How much does it cost to
execute the task?", we inquire for something like "What percentage of tasks do you prefer to this one?" At the end
of the day, and for our objectives, these questions request the same information, but the latter is normalized
in the interval [0, 1], which is a great advantage.

Figure 1: At the top, we can see the execution task cost histograms of two different players. Note that they follow
different probability distributions. At the bottom, we depict the Cumulative Distribution Function (CDF) for both.
As can be noticed through the depicted arrows, the fact that player A has a lower cost than player B (0.3 versus
11) does not mean that player A will be assigned the task. Instead, when applying the PIT, player B is the one
publishing the lower normalized cost.
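
The effect of normalization on the allocation can be illustrated with a small simulation. The sketch below assumes two honest nodes with hypothetical cost distributions (exponential with mean 1, and uniform on [0, 100]) whose CDFs are known in closed form; it compares the share of tasks each node executes when raw costs are compared with the share obtained when PIT-normalized costs are compared.

```python
import math
import random


def simulate_shares(rounds, normalize, seed=0):
    """Fraction of tasks executed by each of two honest nodes.

    Node 0 draws costs from an exponential distribution with mean 1 and
    node 1 from a uniform distribution on [0, 100]. Both distributions are
    hypothetical choices made for this illustration.
    """
    rng = random.Random(seed)
    executed = [0, 0]
    for _ in range(rounds):
        c0 = rng.expovariate(1.0)
        c1 = rng.uniform(0.0, 100.0)
        if normalize:
            c0 = 1.0 - math.exp(-c0)  # exponential CDF evaluated at the cost
            c1 = c1 / 100.0           # uniform CDF evaluated at the cost
        winner = 0 if c0 <= c1 else 1
        executed[winner] += 1
    return [count / rounds for count in executed]


raw_shares = simulate_shares(20000, normalize=False)
pit_shares = simulate_shares(20000, normalize=True)

if __name__ == "__main__":
    print("raw costs:       ", raw_shares)
    print("normalized costs:", pit_shares)
```

With raw costs, the node that measures its costs on the smaller scale ends up executing almost every task, whereas with normalized costs both nodes execute roughly half of them.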

Although from an analytic point of view we assume that players could compute the PIT perfectly, in a practical
setup players do not need to consider any a priori probability distribution. They can simply generate costs using
their particular distribution and apply the PIT using the successive generated samples. This process uses what in
statistics is known as the Empirical Cumulative Distribution Function (ECDF). We will review this concept later,
when we analyze the practical formulation of QPQ in Section 5.
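
A minimal sketch of an ECDF-based normalization is given below. The class name `EmpiricalPIT` and the decision to include the current cost in the sample before evaluating the ECDF are assumptions of this example and are not taken from the practical formulation of QPQ.

```python
from bisect import bisect_right, insort


class EmpiricalPIT:
    """Normalizes costs with the empirical CDF of the costs seen so far."""

    def __init__(self):
        self._sorted_costs = []

    def normalize(self, cost):
        """Record the cost and return the fraction of recorded costs <= cost."""
        insort(self._sorted_costs, cost)
        rank = bisect_right(self._sorted_costs, cost)
        return rank / len(self._sorted_costs)


if __name__ == "__main__":
    pit = EmpiricalPIT()
    print([pit.normalize(c) for c in [5.0, 3.0, 9.0, 4.0]])
```

The first values produced this way are coarse (the very first one is always 1), and the approximation to the true CDF improves as more samples are recorded.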

Acceptance test. Once we know the properties of the PIT, it is clear how we can implement the linking
mechanism for the acceptance test. The idea is that any player correctly applying the PIT on her real cost distribution
must generate published normalized cost values that are uniformly distributed on the unit interval. Hence, from
the point of view of the mechanism designer, the problem consists of determining whether or not these published values
follow that uniform distribution. There is a wide range of tests that allow checking that. These tests are
called Goodness-of-Fit (GoF) tests.

Continuing with this argument, we propose to implement the acceptance test of our algorithm by using some
GoF test on the sequence of transformed costs published by the player. Whenever a player is honest and
she declares the values by applying the PIT on her own distribution, these values will be uniformly
distributed in the unit interval. In that case (with high probability) the GoF tests will accept the samples. More
importantly, this process has an error which tends to zero when the number of samples (rounds) increases, for any
reasonable value of the threshold. For the study of our analytic results, we assume that GoF tests are perfect and
this error is zero. (We will review this concept again in our practical implementation of QPQ, in Section 5.)
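
As one possible instance of such a test, the sketch below implements a one-sample Kolmogorov-Smirnov check against the uniform distribution on [0, 1]. The choice of this particular test is an assumption of the example (the text does not commit to a specific GoF test), as are the function names and the default coefficient 1.36, which is the usual asymptotic constant for a 5% significance level.

```python
import math


def ks_statistic_uniform(samples):
    """Kolmogorov-Smirnov distance between the samples' ECDF and U(0, 1)."""
    xs = sorted(samples)
    n = len(xs)
    distance = 0.0
    for k, x in enumerate(xs, start=1):
        # The ECDF jumps from (k - 1) / n to k / n at x; the uniform CDF is x.
        distance = max(distance, k / n - x, x - (k - 1) / n)
    return distance


def gof_test(samples, coefficient=1.36):
    """Accept the samples when the KS distance is at most coefficient / sqrt(n).

    The threshold uses the asymptotic approximation, so it is only meant
    for reasonably long histories.
    """
    n = len(samples)
    return ks_statistic_uniform(samples) <= coefficient / math.sqrt(n)


if __name__ == "__main__":
    n = 100
    evenly_spread = [(k - 0.5) / n for k in range(1, n + 1)]
    inflated = [0.8 + 0.2 * (k - 0.5) / n for k in range(1, n + 1)]
    print(gof_test(evenly_spread), gof_test(inflated))
```

In the usage example, the first history covers the unit interval evenly, while the second mimics a player whose published values are all concentrated in [0.8, 1] in an attempt to avoid executing tasks.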

Punishment. In the case that a dishonest player tries to avoid the execution of tasks, one possible strategy is to
generate increasing cost values, so that the PIT-transformed values are close to one. However, this type of
behavior is quickly detected by the test. An open question is how to establish a punishment for this and any other
player whose GoF test comes out negative. One possibility is to force the node to execute the task. Unfortunately,
this policy would force fair players to execute tasks in cases of false negatives.

Another possibility, inspired by previous works on linking mechanisms, is to reject the value declared by the
player and generate a new random value according to the normalized uniform distribution. Additionally, we require
that no central entity exists in the system. For these reasons, we propose to use a deterministic (repeatable) random
generator that any of the remaining nodes can use to calculate the new value. (We deal with the practical aspects
of this approach in Section 5.) At first sight, this strategy may seem a very poor punishment, given that there is
always a chance that a player gets away with a lie. However, later in this paper we will prove that this is
not only enough to discourage dishonest players, but also a crucial ingredient to guarantee that our mechanism is
strategy-proof.

4 The Quid Pro Quo Mechanism 

After describing the different ingredients of our solution, we are able to propose the final algorithm, which we call
the Quid Pro Quo (QPQ) mechanism. The details can be observed in Algorithm 2.

Algorithm 2 Quid Pro Quo mechanism (code for node $i$ and a generic task $t$, which is omitted from the notation)

    Estimate the cost $c_i$ of the task
    Publish the normalized cost $\bar{c}_i = \mathrm{PIT}(c_i)$
    Wait to receive the normalized costs $\bar{c}_j$ from the other players
    for all $j \in N$ do
        if not GoF_Test($\bar{c}_j$, Historic_j, p-th_j) then
            $\bar{c}_j \leftarrow \mathrm{Random}(\bar{c}_{-j})$
        end if
        Historic_j $\leftarrow$ Historic_j $\cup \{\bar{c}_j\}$
    end for
    $d \leftarrow \arg\min_{j \in N} \bar{c}_j$
    if $d = i$ then
        execute the task
    else
        do nothing (node $d$ will execute the task)
    end if



Note that we use $\bar{c}_i$ to denote the PIT-normalized cost to be published, while $c_i$ is the actual cost. We also
store in $\bar{c}_j$ the pseudorandom value that replaces the value published by player $j$ when it does not pass the acceptance
test. (Hopefully context will allow disambiguation.) It is important to notice that the algorithm is the same for all
participants, and that it is based on information known by them. Therefore, no central entity is required. When a
task is issued, each node can estimate its own cost and publish its PIT-normalized value. This value is then received
by all other players. When a player has all the values, she checks whether any player published a dishonest value
by applying the GoF test. If the value does not pass the test, it is regenerated as described above, by using a
pseudorandom generator (that allows all players to generate the same value) of uniformly distributed values in
[0, 1]. With these revised values, the player proceeds to determine whether her own value is the minimum, in which
case she executes the task, publishing the results to the rest of the nodes if necessary.
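The decision step that every node runs locally can be sketched as follows. This is only a minimal illustration under stated assumptions: the names `qpq_round`, `gof_test` and `regenerate` are hypothetical, the GoF test and the shared pseudorandom generator are passed in as callables rather than specified, and ties are broken by player identifier so that all nodes reach the same decision.

```python
from typing import Callable, Dict, List


def qpq_round(
    published: Dict[str, float],
    history: Dict[str, List[float]],
    gof_test: Callable[[float, List[float]], bool],
    regenerate: Callable[[str, Dict[str, float]], float],
) -> str:
    """Return the id of the node that must execute the task.

    Hypothetical sketch of the loop and argmin of Algorithm 2:
    `gof_test(value, past_values)` returns True when the value is accepted,
    `regenerate(player, published)` returns the shared replacement value.
    """
    final: Dict[str, float] = {}
    for player in sorted(published):  # same iteration order on every node
        value = published[player]
        if not gof_test(value, history[player]):
            # Rejected value: replace it with the repeatable pseudorandom one.
            value = regenerate(player, published)
        history[player].append(value)  # the history keeps the final value
        final[player] = value
    # Ties are broken by player id so that all nodes agree on the winner.
    return min(sorted(final), key=lambda p: final[p])


if __name__ == "__main__":
    hist = {"a": [], "b": [], "c": []}
    winner = qpq_round(
        {"a": 0.7, "b": 0.1, "c": 0.4},
        hist,
        gof_test=lambda value, past: True,      # accept everything
        regenerate=lambda player, pub: 0.5,     # never called here
    )
    print(winner)
```

Because every node evaluates the same function on the same published values, each one can decide on its own whether it is the node `d` that must execute the task.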

In the following sections, we formally study the expected harm (or reduction of benefit) that dishonest behavior
causes in QPQ. Intuition says that the loss due to a dishonest player should be comparable to having that player
execute tasks at random. Indeed, we show below that, independently of their behavior, nodes may never expect
a profit lower than the one obtained through a mechanism in which tasks are randomly assigned. This property is
very useful in case the node is not capable of accurately evaluating its costs (it is non-rational).

Another important aspect is that QPQ guarantees a minimum benefit to the entire system, even if one or more
players are non-rational or rationally limited. In this sense, we will show that the best strategy for any player is to
act as if the rest of the players were rational and fair. That is, incorrect behaviors of some players do not alter
the strategy of correct players. In the next section, we prove all these claims formally.

4.1 Formal Analysis of QPQ 

Our QPQ algorithm is strongly inspired by the work of Jackson et al. [10]. Hence, some of our proofs have
been adapted from the ones provided there. We review now the most relevant properties of the QPQ mechanism
presented in Algorithm 2. Assuming that the number of rounds (tasks) is large enough, and that players' costs are
independent of each other, we prove the following properties.

1. QPQ is optimal in the sense that it minimizes the total work done when all players are honest.

2. For any player, the rest of players can be seen as a single aggregated player. For each task, the aggregated 
player's cost is the smallest of its members'. These costs follow a Beta distribution. 

3. The best strategy of a player is independent of the behavior of the rest. 

4. The strategy that optimizes the utility of a player is being honest. In game theory terminology, this means 
that QPQ is strategy-proof. 

5. Each player always obtains a positive expected utility, which is determined by the number of players. 

6. An irrational or rationally-limited player always obtains a positive profit. 

7. The system is fair in the allocation of tasks and in normalized effort. That is, all the players will run the 
same number of tasks and perform a similar normalized effort (in expectation). 

8. When the number of players is high enough, QPQ ensures very attractive performance.

To address the mathematical analysis of the algorithm we will assume that the PIT and GoF steps are perfect. In
fact, with a large number of samples, these processes have errors close to zero. Another aspect that will simplify
our analysis is the idea of an aggregated player. We evaluate the performance of a node playing against a "fictitious"
node that aggregates the responses of all other nodes. This aggregated player behaves by publishing at each round
the minimum of all the normalized costs of the players in the aggregation. This approach is compatible with all
the assumptions of the model and is helpful because it significantly simplifies the analysis.

To make our notation clearer, given a task, we use $x$ to denote the true normalized cost $\bar{c}_i$ of player $i$ for that
task, while $X$ or $X_i$ is the random variable for that value. When executing QPQ, players may publish $x = \bar{c}_i$ or
another false value. In that case, we use $z$ to denote that dishonest value and also, overloading the notation,
the re-generated random value replacing it when the GoF test fails. We assume that the $z$ values are realizations
of some random variable $Z$. Given a task with normalized cost $\bar{c}_i$, the player obtains a normalized utility $U_i = \bar{c}_i$ when
she does not execute the task (independently of what she published) and performs a normalized work of $W_i = \bar{c}_i$
when she executes the task (where $W$ denotes the random variable). Additionally, we use $y$ to denote the value
$\min \bar{c}_{-i}$ published by an aggregated player. Following mechanism design notation, we say that the (social) decision
function $d$ of QPQ is

$$d = \arg\min_j \bar{c}_j.$$

Then, we define $\Pr[d = i]$ as the probability that player $i$ declares the minimum value and executes the task.
When working with the aggregated player, $Y$ is a vector of random variables, and we use $\Pr[Y < y]$ to denote the
probability that at least one element of $Y$, say $j$, satisfies $Y_j < y$.

With this notation in mind, we can prove that, for any player i, the expectation of the declared costs is equal to 
the expected utility plus the expected work. Additionally, this quantity is a constant. I.e., 

$$E[\bar{c}_i] = E[U_i + W_i] = E[U_i] + E[W_i] = \int_0^1 \bar{c}_i \, d\bar{c}_i = \frac{1}{2}. \tag{4.1}$$

This means that a player maximizes her utility when she minimizes her work, and vice-versa. In the following 
propositions, we will use this fact. 

Players' normalized costs distributions We argue here that all players' normalized costs follow independent
uniform distributions on [0, 1]. When players are honest, their reported values follow a uniform distribution on [0, 1].
This follows from the properties of the PIT transformation introduced above. On the other hand, when a player
is dishonest, she may change the distribution of her normalized costs trying to obtain extra benefit. However, we
assume that in this case the GoF test fails. Then, her attempt will be detected, and the value will be replaced by
a pseudorandom value drawn from an independent uniform distribution on [0, 1]. A final case is that the dishonest
player may generate fake normalized costs that follow a uniform distribution on [0, 1], hence passing the GoF test.
In this case the normalized cost $\bar{c}_i(t)$ for a task $t$ is independent of the values $\bar{c}_j(t)$ of the other players, since,
values being published concurrently, it has to be sent before the others are received. Hence, the following result.

Proposition 4.1. The final normalized costs considered in Line 10 of Algorithm 2 are drawn from independent
and identically distributed (i.i.d.) random variables, with uniform distribution on [0, 1].

Optimality The QPQ algorithm is optimal in the sense that, if all players are honest, it minimizes the total 
normalized work done. 

Proposition 4.2. Assume that all players are honest. For a given set $T$ of tasks, there is no mechanism $M$ such
that $E\left[\sum_{i=1}^{n} W_i^M\right] < E\left[\sum_{i=1}^{n} W_i\right]$, where $W_i^M$ is the random variable associated with the normalized work done
by player $i$ when using mechanism $M$.

Proof. The proof is straightforward by contradiction. Assuming that such a mechanism $M$ exists, there must be
at least one task for which the work done under $M$ is smaller than under QPQ, $w^M < w$. However, the social decision
function of QPQ always selects the player publishing the minimum of the normalized costs, so it is not possible
for $M$ to select another player capable of executing the task at a lower cost. So, we conclude that $M$ cannot exist. □

Aggregated player It is assumed that players' normalized costs have independent uniform distributions on [0, 1].
Hence, the probability density function of each player $i$ is $f_i(\bar{c}_i) = 1$ on that interval. Thus, the costs of an
aggregated player for $n - 1$ nodes follow a probability distribution $\mathrm{Beta}(y; 1, n - 1)$, as shown next.

Proposition 4.3. The costs $Y$ of the aggregated player of $n - 1$ i.i.d. players (with uniform distribution on [0, 1])
follow a $\mathrm{Beta}(y; 1, n - 1)$ distribution, with probability density function $f(y) = (n - 1)(1 - y)^{n-2}$ and CDF
$F[y] = \Pr[Y < y] = 1 - (1 - y)^{n-1}$.

Proof. Recall that the cost of an aggregated player is the minimum of the normalized costs of the players in
the aggregation. The CDF $F(\cdot)$ of that cost can be obtained as follows. Let us assume that the players in the
aggregation are 1 to $n - 1$.

\begin{align}
F[y] = \Pr[Y < y] &= 1 - \Pr[Y > y] \tag{4.2}\\
&= 1 - \prod_{j=1}^{n-1} \Pr[Y_j > y] \tag{4.3}\\
&= 1 - \left(\int_y^1 1 \, dp\right)^{n-1} \tag{4.4}\\
&= 1 - (1 - y)^{n-1}, \tag{4.5}
\end{align}

where $Y_j$ is the random variable associated with the normalized cost of node $j$. Hence, the probability density
function is

$$f(y) = (n - 1)(1 - y)^{n-2}.$$

The Beta distribution is defined as follows [21].

\begin{align}
\mathrm{Beta}(y; 1, n - 1) &= \frac{1}{B(1, n - 1)} (1 - y)^{(n-1)-1} \tag{4.6}\\
&= \frac{(1 + (n - 1) - 1)!}{((n - 1) - 1)!} (1 - y)^{n-2}, \tag{4.7}
\end{align}

where $B(\cdot, \cdot)$ is the Beta function. Now, it is easy to check that $f(y) = \mathrm{Beta}(y; 1, n - 1)$. □
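The closed form of Proposition 4.3 can be checked numerically. The sketch below is only an illustration (the function names, the number of trials and the seed are arbitrary choices, not part of the paper): it compares $1 - (1 - y)^{n-1}$ with the fraction of simulated rounds in which the minimum of $n - 1$ independent uniform values falls below $y$.

```python
import random


def aggregated_cdf(n: int, y: float) -> float:
    """Closed form from Proposition 4.3: Pr[Y < y] = 1 - (1 - y)^(n-1)."""
    return 1.0 - (1.0 - y) ** (n - 1)


def simulated_cdf(n: int, y: float, trials: int = 100_000, seed: int = 1) -> float:
    """Fraction of trials in which the minimum of n-1 uniforms is below y."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        if min(rng.random() for _ in range(n - 1)) < y:
            hits += 1
    return hits / trials


if __name__ == "__main__":
    # The two numbers should agree up to Monte Carlo noise.
    print(aggregated_cdf(5, 0.3), simulated_cdf(5, 0.3))
```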

Players' strategies Every rational player knows that the rest of the players follow uniform and independent distributions.
The question a selfish rational player asks is which strategy yields the greatest possible
benefit. If a player uses a distribution other than the uniform, her values will be rejected by the GoF test, and will be
re-generated from a uniform distribution. However, a player could lie following a uniform distribution that is not
independent of her actual values. Note that QPQ does not know the true normalized costs (they are private) and
uses for the assignment decision the declared value or the random value assigned by the system if a lie is detected.
In both cases, the aggregated player sees a random variable $Z$ that must follow a uniform distribution. We show
now that either case drives the player to worse results than her own honest distribution, so that player will not
have any incentive to cheat.

Let us first quantify the expected work done by honest players. 

Proposition 4.4. The expected normalized work $E[W_i]$ done by an honest player $i$ is

$$E[W_i] = \frac{1}{n + n^2}.$$

Proof. Recall that we assume that player $i$ is in the system with an aggregated player of $n - 1$ nodes. Then,
$\Pr[d = i]$ is the probability that player $i$ publishes a normalized cost smaller than the one of the aggregated player.

\begin{align}
E[W_i] &= \int_0^1 x \Pr[d = i] \, dx \tag{4.8}\\
&= \int_0^1 x \int_x^1 (n - 1)(1 - y)^{n-2} \, dy \, dx \tag{4.9}\\
&= \int_0^1 x (1 - x)^{n-1} \, dx \notag\\
&= \frac{1}{n + n^2}. \tag{4.10}
\end{align}

Notice that we use the probability distribution of the aggregated player derived in Proposition 4.3. □

Proposition 4.5. The total normalized work done by an aggregated player $j$ (aggregating $n - 1$ nodes), with costs
$y$, does not change when a player $i$ (not in the aggregation) declares dishonest values $z$ in place of her true normalized costs $x$.

Proof. Let us abuse the notation and use $z$ to denote the dishonest values declared by $i$ if the GoF test is passed, or the
re-generated values if it is not. Let $Z$ be the uniform random variable associated with these values. We assume
that there is a bivariate probability distribution with density $f_{X,Z}(x, z)$ that relates both values $x$ and $z$. In that
case, the marginal distribution for $z$ must be uniform. Therefore, we have

\begin{align}
f_X(x) &= \int_0^1 f_{X,Z}(x, z) \, dz = 1 \tag{4.11}\\
f_Z(z) &= \int_0^1 f_{X,Z}(x, z) \, dx = 1 \tag{4.12}\\
\int_0^1 \int_0^1 f_{X,Z}(x, z) \, dz \, dx &= 1. \tag{4.13}
\end{align}



Hence, the expected work done by j is 



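\begin{align}
E[\widetilde{W}_j] &= \int_0^1 y \Pr[d = j] \, dy \tag{4.14}\\
&= \int_0^1 y (n - 1)(1 - y)^{n-2} \int_0^1 \int_y^1 f_{X,Z}(x, z) \, dz \, dx \, dy \tag{4.15}\\
&= \int_0^1 y (n - 1)(1 - y)^{n-2} \int_y^1 \int_0^1 f_{X,Z}(x, z) \, dx \, dz \, dy, \tag{4.16}
\end{align}

where $E[\widetilde{W}_j]$ is the expected work done by the aggregated player $j$ when player $i$ lies. But, as we have uniform
marginals, the above expression becomes

\begin{align}
E[\widetilde{W}_j] &= \int_0^1 y (n - 1)(1 - y)^{n-2} \int_y^1 1 \, dz \, dy \tag{4.17}\\
&= \int_0^1 y (n - 1)(1 - y)^{n-2} (1 - y) \, dy \tag{4.18}\\
&= \int_0^1 y (n - 1)(1 - y)^{n-1} \, dy \tag{4.19}\\
&= \frac{n - 1}{n + n^2}, \tag{4.20}
\end{align}

which is equal to the total work done by the aggregated player $j$ when $i$ is honest, which can be computed as follows.

\begin{align}
\int_0^1 y (n - 1)(1 - y)^{n-2} \int_y^1 1 \, dx \, dy &= \int_0^1 y (n - 1)(1 - y)^{n-1} \, dy \tag{4.21}\\
&= \frac{n - 1}{n + n^2}. \tag{4.22}
\end{align}

□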

In summary, an aggregated player $j$ expects to perform the same amount of work, independently of the behavior
of a given player $i$ not in the aggregation. I.e., its expected work is not affected by whether $i$ is honest or dishonest.
This allows us to prove that the optimal strategy for a player is to be honest.

Proposition 4.6. A player $i$ never does more normalized work (in expectation) by being honest. That is,

$$E[W_i] \le E[\widetilde{W}_i], \tag{4.23}$$

where $E[\widetilde{W}_i]$ is the expected work performed by player $i$ when it is dishonest.

Proof. For the sake of contradiction, let us suppose this proposition is false. Hence, there is some set of tasks
for which, if $i$ is not honest, it performs less work in expectation. I.e., $E[W_i] > E[\widetilde{W}_i]$. Additionally, using
Proposition 4.5 we know that the aggregated player, $j$, will do the same expected work, i.e., $E[W_j] = E[\widetilde{W}_j]$.
Hence, it follows that $E[W_i] + E[W_j] > E[\widetilde{W}_i] + E[\widetilde{W}_j]$. However, if the above inequality were true, QPQ would
not be optimal, since a mechanism that reproduces the same task assignments made when $i$ lies (in the presence of
honest players) would have less expected work. Clearly, this contradicts Proposition 4.2. Therefore, the
best strategy for a player (the one minimizing her normalized work done) is to be honest. □

We complement this result with the following property. 

Proposition 4.7. When a player $i$ publishes dishonest non-uniform values or values independent of her true
normalized uniform distribution, it performs in expectation $E[\widetilde{W}_i] = \frac{1}{2n}$ work.

Proof. The values $z$ used to decide whether to assign a task to player $i$ follow a uniform distribution that is
independent of the actual costs for $i$. Hence,

\begin{align}
E[\widetilde{W}_i] &= \int_0^1 x \Pr[d = i] \, dx \tag{4.24}\\
&= \int_0^1 x \int_0^1 \int_z^1 (n - 1)(1 - y)^{n-2} \, dy \, dz \, dx \tag{4.25}\\
&= \frac{1}{2n}. \tag{4.26}
\end{align}

□



From this result, Proposition 4.4 and Eq. 4.1 we directly derive the following theorem. 



Theorem 4.8. Given that $n > 2$, it holds that $E[W_i] = \frac{1}{n + n^2} < \frac{1}{2n} = E[\widetilde{W}_i]$. Hence, since the sum of the
expected work and expected utility is $\frac{1}{2}$, players obtain higher expected utilities by being honest than by publishing
dishonest normalized costs.
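The two expected works compared in the theorem can be reproduced with a small Monte Carlo experiment. The sketch below is illustrative only: the function name, the number of rounds and the seed are arbitrary, and the dishonest player is modelled in the simplest way covered by Proposition 4.7, namely by declaring a uniform value that is independent of her true normalized cost.

```python
import random


def expected_work(n: int, honest: bool, rounds: int = 200_000, seed: int = 7) -> float:
    """Average normalized work per round of player i against n-1 honest players."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(rounds):
        x = rng.random()                    # true normalized cost of player i
        z = x if honest else rng.random()   # declared value (independent lie)
        y = min(rng.random() for _ in range(n - 1))  # aggregated player
        if z < y:                           # player i is selected and pays x
            total += x
    return total / rounds


if __name__ == "__main__":
    n = 4
    print("honest   :", expected_work(n, True), "theory", 1 / (n + n * n))
    print("dishonest:", expected_work(n, False), "theory", 1 / (2 * n))
```

With $n = 4$ the theoretical values are $1/20$ for the honest player and $1/8$ for the dishonest one, so lying in this way more than doubles the expected normalized work.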



Real expected utility Note that the normalized work done by an honest player, as calculated above, is equal to
$\frac{1}{n + n^2}$. But we may wonder what the real (not normalized) work done is. We can easily calculate it in terms of real
utility as follows.

Theorem 4.9. For each player $i$, the real expected utility is

$$E[U_i] = \int_\Omega x f_i(x) \left(1 - (1 - F_i(x))^{n-1}\right) dx,$$

where the real cost of player $i$ is a continuous random variable with support $\Omega$, probability density function $f_i(\cdot)$,
and CDF $F_i(\cdot)$.

Proof. Let $Y_j = \mathrm{PIT}(X_j) = F_j(X_j)$ be the uniform random variable that gives the normalized cost for player
$j \ne i$ at the time of assigning the tasks (Line 10), and $Y = \min_{j \ne i} \{Y_j\}$. Then, the expected (real) utility of player
$i$ is:

\begin{align*}
E[U_i] &= \int_\Omega x f_i(x) \Pr(Y < F_i(x)) \, dx\\
&= \int_\Omega x f_i(x) \left(1 - (1 - F_i(x))^{n-1}\right) dx,
\end{align*}

where we have used that $\Pr[Y < y] = 1 - (1 - y)^{n-1}$ (Proposition 4.3). □



4.1.1 Fairness 

The following result, combined with Proposition 4.1, will be used to show that all players execute, in expectation,
the same number of tasks, even when some players are non-rational or dishonest.

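Proposition 4.10. Let $X_1, X_2, \ldots, X_n$ be $n$ continuous and i.i.d. random variables, then:

$$\Pr\left(X_i < \min_{j \ne i} \{X_j\}\right) = \frac{1}{n}.$$

Proof. Let $f$ and $F$ denote the common density and CDF of the variables. Then,

\begin{align*}
\Pr\left(X_i < \min_{j \ne i} \{X_j\}\right) &= \int_{-\infty}^{\infty} f(y) \left[\int_y^{\infty} f(x) \, dx\right]^{n-1} dy\\
&= \int_{-\infty}^{\infty} f(y) \, [1 - F(y)]^{n-1} \, dy\\
&= -\int_{-\infty}^{\infty} [1 - F(y)]^{n-1} \, d(1 - F(y))\\
&= -\left.\frac{[1 - F(y)]^{n}}{n}\right|_{-\infty}^{\infty}\\
&= \frac{1}{n}.
\end{align*}

□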

Hence, QPQ not only offers the best utility guarantee to honest rational players, but it also offers good properties
in environments where nodes have difficulty in estimating costs. This is because, even in environments where the
nodes are non-rational, QPQ divides the work fairly and optimally with respect to the declared normalized costs.
Clearly, non-rational players exert more effort, but always under completely random task assignments. In
other words, the extra cost of non-rational players is caused by their own ignorance, not by the wickedness of
the other players. Then, given that players are assigned tasks by choosing the smallest value from a set of
i.i.d. random variables (Proposition 4.1), QPQ ensures that the expected number of tasks executed by
each node is $|T|/n$, where, recall, $T$ is the set of tasks and $n$ is the number of players.
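The fairness claim is easy to check empirically. The following sketch is an illustration only (the function name, the number of tasks and the seed are arbitrary): it assigns each task to the player with the smallest of $n$ i.i.d. uniform values and reports the fraction of tasks won by each player, which should be close to $1/n$.

```python
import random
from collections import Counter
from typing import List


def task_shares(n: int, tasks: int = 100_000, seed: int = 3) -> List[float]:
    """Fraction of tasks executed by each of n players under argmin assignment."""
    rng = random.Random(seed)
    wins: Counter = Counter()
    for _ in range(tasks):
        values = [rng.random() for _ in range(n)]
        wins[values.index(min(values))] += 1   # the smallest value executes
    return [wins[i] / tasks for i in range(n)]


if __name__ == "__main__":
    # With n = 5 every share should be close to 0.2.
    print(task_shares(5))
```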

Corollary 4.11. In QPQ, players will execute in expectation a proportion of $\frac{1}{n}$ of the tasks, and thus a total of
$\frac{|T|}{n}$ tasks.

Proof. Declared values follow continuous and identically distributed (uniform) random variables in [0, 1], from
Proposition 4.1, and therefore, applying Proposition 4.10, each player will execute in expectation a proportion of
$\frac{1}{n}$ of the tasks. □

4.1.2 Bounds



Finally, we think that it could be interesting to define some ratio that measures how the efficiency of the QPQ
mechanism degrades with the selfish behavior of the players. Following concepts similar to the "price of anarchy"
[4], we define the measure of efficiency as the ratio between the utility of an equilibrium (usually the "worst
equilibrium") and the utility of some optimal solution.

Obviously, the player's normalized utility must be between 0, when the node runs all tasks, and $\frac{1}{2}$, when the node
has not executed any task. But there are two levels that may be considered as references to establish the goodness
of the algorithm. On one side, when a node runs a fraction $\frac{1}{n}$ of the tasks chosen completely at random, the expected
effort is $\frac{1}{2n}$. On the other hand, the maximum benefit a player $i$ could get occurs when the tasks she executes are
exactly her cheapest tasks. In this case, the expected utility would be

\begin{align}
E[U_i^{\max}] &= \frac{1}{2} - E[W_i] \tag{4.27}\\
&= \frac{1}{2} - \int_0^{1/n} x \, dx \tag{4.28}\\
&= \frac{1}{2} - \frac{1}{2n^2}. \tag{4.29}
\end{align}

Although this case has null probability, we propose to use this concept for our definition of measure of efficiency.

Definition 4.1 (Measure of efficiency). We define the measure of efficiency of an algorithm $M$ for task assignment
under selfish behaviour as the ratio between the expected normalized utility obtained under some equilibrium
and $E[U^{\max}] = \frac{1}{2} - \frac{1}{2n^2} = \frac{n^2 - 1}{2n^2}$. I.e.,

$$\text{Efficiency} = \frac{E[U^M]}{E[U^{\max}]} = \frac{2n^2 E[U^M]}{n^2 - 1}.$$

Hence, we can compute the efficiency of QPQ as

$$\text{Efficiency} = \frac{2n^2}{n^2 - 1}\left(\frac{1}{2} - \frac{1}{n + n^2}\right) = \frac{n^2 (n^2 + n - 2)}{n (n + 1)(n^2 - 1)} = \frac{n (n + 2)}{(n + 1)^2}.$$

Note that the efficiency of QPQ is close to 1 when the number of participants is high. For instance, with just 10 
nodes the efficiency of QPQ is 0.991. 
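The figure quoted for 10 nodes can be reproduced with exact rational arithmetic. The sketch below is only a worked check; the function name is illustrative, and it takes the honest expected utility as $\frac{1}{2} - \frac{1}{n + n^2}$ (Proposition 4.4 and Eq. 4.1) and the reference utility as $\frac{1}{2} - \frac{1}{2n^2}$.

```python
from fractions import Fraction


def qpq_efficiency(n: int) -> Fraction:
    """Ratio between the honest QPQ utility and the best conceivable utility."""
    honest_utility = Fraction(1, 2) - Fraction(1, n + n * n)
    max_utility = Fraction(1, 2) - Fraction(1, 2 * n * n)
    return honest_utility / max_utility


if __name__ == "__main__":
    for n in (2, 5, 10, 100):
        eff = qpq_efficiency(n)
        print(n, eff, float(eff))
```

For $n = 10$ the ratio is exactly $120/121 \approx 0.9917$, in agreement with the value given above, and it approaches 1 as $n$ grows.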



5 Implementing QPQ in real environments 

In this section, our objective is to analyze the restrictions for QPQ to be implemented in real environments.
From the previous sections, we may claim that the computation and communication capabilities required by the algorithm
are affordable with current technology. We do not claim that implementing such capabilities would be an easy task,
since there are many technological challenges that should be addressed to do it. Previous works show some
of them [22]. Thus, our only claim is that it would be feasible.

However, going beyond the required communication and computation capabilities, we may see that a number
of issues arise. The first one concerns the definition of selfishness itself. This paper is mainly focused on detecting
and neutralizing users publishing values not coming from the PIT of their real costs. However, one can claim that
other non-cooperative harmful behaviors are possible such as, for example, not executing tasks at all, or executing
them incorrectly. Hopefully, most of these misbehaviors can be avoided using a two-step scheme. First, by
detecting such behaviors (previous works in the area show that it is possible [23, 24]). Second, by establishing a
strong enough punishment to discourage misbehaving players from repeating them. For example, we may adopt
the radical solution of simply expelling misbehaving users. In order to guarantee that offending players do not participate
again, all that is needed is that user identities are unique and cannot change across game instances. Note
that QPQ does not discard misbehaving users, because it assumes that the publication of dishonest values cannot
be distinguished from the publication of values generated by rationally-limited players, and it would not be
reasonable to expel the latter from the game given that, in a realistic scenario, all players would have some
rationality limitations (i.e., it is not possible to estimate costs with total accuracy). Hence, QPQ's approach of
keeping them in the system is one of the most difficult ways of dealing with selfish users.

Coming back to the subtleties of QPQ, another point to consider is how to re-generate the random value when
the system detects a lie. As we said before, we require a deterministic (repeatable) random generator that any
of the remaining nodes can use to calculate the new value. One possibility for generating the random value is to
use a hash function over the published normalized costs of the other nodes. Alternatively, it is possible to request a
random value from each player (except the player in question) and apply the hash function to them. Yet
another possibility is to use techniques similar to the procedures proposed by Aumann et al. [25] to generate jointly
controlled lotteries. For example, for two players, we can request random values from both, and replace the
liar's value by the sum of these numbers if the sum is less than 1, or by the sum minus one otherwise. With this
scheme, it is easy to show that when one of the players declares random values according to a uniform distribution,
then this process generates random values that are also uniform over [0, 1], regardless of what the other player does. As a
conclusion, we may claim that there are several mechanisms suitable for generating the punishment random
value independently of the behavior of a dishonest player.
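Two of these options can be sketched in a few lines. The code below is a hypothetical illustration, not a prescribed implementation: the choice of SHA-256, the use of `repr` as a serialization, the `round_id` argument and both function names are assumptions made for the example, and the two-player lottery is written as the sum of the two values taken modulo 1 so that the result stays in [0, 1).

```python
import hashlib
from typing import Dict


def hash_regenerate(liar: str, published: Dict[str, float], round_id: int) -> float:
    """Repeatable value in [0, 1) derived from the other players' published costs.

    Assumption: all nodes serialize the same data in the same way; a real
    system would need a canonical encoding instead of repr().
    """
    others = sorted((p, v) for p, v in published.items() if p != liar)
    payload = repr((round_id, liar, others)).encode("utf-8")
    digest = hashlib.sha256(payload).digest()
    # Keep 53 bits so the quotient is an exact float strictly below 1.
    return (int.from_bytes(digest[:8], "big") >> 11) / 2**53


def joint_lottery(a: float, b: float) -> float:
    """Two-player jointly controlled lottery: (a + b) modulo 1."""
    s = a + b
    return s if s < 1.0 else s - 1.0


if __name__ == "__main__":
    costs = {"a": 0.30, "b": 0.05, "c": 0.62}
    print(hash_regenerate("b", costs, round_id=17))
    print(joint_lottery(0.75, 0.5))
```

Note that the hash does not depend on the value published by the rejected player, so she cannot steer her own replacement value by changing what she declares.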

Another obstacle that stands in the way of a potential implementation of the mechanism is the acceptance test.
We have assumed that we have a perfect GoF test function. This is somewhat similar to assuming that we have a set
of samples whose number is very large (ideally infinite) for detecting lies with the usual tests. In a real system,
this solution is impractical since nodes would be required to store all the historical values of the rest of the players, and
initially the number of samples is necessarily limited. As we saw before, we propose that players simply generate
costs using their particular distribution and apply the PIT using the successive generated samples. The CDF used
for the PIT is synthesized from the existing samples $y_i$. This CDF obtained from samples is known in statistics as
the Empirical Cumulative Distribution Function (ECDF).

Definition 5.1 (Empirical Cumulative Distribution Function). The empirical cumulative distribution function
(ECDF) $F_n$ for $n$ observations $y_i$ is defined as

$$F_n(x) = \frac{1}{n} \sum_{i=1}^{n} \mathbf{1}\{y_i \le x\},$$

where $\mathbf{1}\{A\}$ is the indicator function or the characteristic function of event $A$. In our case, it is defined as

$$\mathbf{1}\{y_i \le x\} = \begin{cases} 1 & \text{if } y_i \le x,\\ 0 & \text{otherwise.} \end{cases}$$

Obviously, this process has an error which tends to zero when the number of samples (rounds) increases, as
proved by the Glivenko-Cantelli theorem [26].

Regarding the GoF test used, a very large number of GoF tests have been proposed in the scientific literature.
Some of them may be applied to discrete distributions and others require continuous distributions. The Kolmogorov-Smirnov
(KS) test [27, 28] is probably the best-known test for continuous distributions, basically due to its simplicity.
The KS test calculates the greatest distance between the ECDF associated with a sequence of samples and
the CDF we want to check. It may be defined by the following expression:

$$D = \max_{1 \le i \le n} \left( F(x_i) - \frac{i - 1}{n}, \; \frac{i}{n} - F(x_i) \right),$$

where $F(\cdot)$ is the CDF to check, $n$ is the number of samples, and $(x_1, x_2, \ldots, x_n)$ is the set of samples arranged in
increasing order. What makes the KS test so versatile is that the distribution of the distance $D$ does not depend
on the theoretical probability distribution (null hypothesis). Several authors, such as Smirnov [28] and Birnbaum and
Tingey [29], have obtained exact and approximate expressions of the distribution of the variable $D$ as a random
function of the number of available samples. Due to the complexity of such expressions, the KS test is often used
through tables containing the most common percentiles.
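The ECDF and the KS distance translate directly into code. The sketch below is a plain-Python illustration with hypothetical function names; the default CDF is the uniform distribution on [0, 1], which is the one QPQ checks normalized costs against.

```python
from typing import Callable, Sequence


def ecdf(samples: Sequence[float], x: float) -> float:
    """Empirical CDF F_n(x): fraction of samples that are <= x."""
    return sum(1 for y in samples if y <= x) / len(samples)


def ks_statistic(
    samples: Sequence[float],
    cdf: Callable[[float], float] = lambda x: x,  # uniform CDF on [0, 1]
) -> float:
    """KS distance D between the ECDF of the samples and the given CDF."""
    xs = sorted(samples)
    n = len(xs)
    # With 0-based index i, the terms are F(x) - i/n and (i+1)/n - F(x).
    return max(
        max(cdf(x) - i / n, (i + 1) / n - cdf(x))
        for i, x in enumerate(xs)
    )


if __name__ == "__main__":
    data = [0.12, 0.35, 0.51, 0.77, 0.93]
    print(ecdf(data, 0.5), ks_statistic(data))
```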






Figure 2: This picture depicts the curves of our elastic p-value thresholds as a function of the normalized utility of
a player. When we have a small number of rounds (blue line for 10 rounds) our system is quite tough, but if the
number of rounds increases (the yellow line corresponds to 100 rounds and the green line to 500 rounds), our proposal
is more relaxed, and accepts values if the player's utility is within a reasonable range.

We propose to use the KS test as the GoF test of QPQ. Hence, whenever a new normalized cost is issued, we
run the KS test on it, together with the historical sequence of that player, so that we obtain the corresponding
p-value. Note that, in statistical significance testing, the p-value is the probability, under the model, of obtaining a test
statistic at least as extreme as the one that was actually observed. Now, the value is accepted by the
test when that p-value is over a particular acceptance threshold, p-th.
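A possible shape for this acceptance test is sketched below. Several choices here are assumptions made for the example and are not taken from the paper: the p-value is computed with the standard asymptotic Kolmogorov series together with a common small-sample correction factor (rather than with percentile tables), only the most recent `window` values are used, and the function names are hypothetical.

```python
import math
from typing import Sequence


def ks_pvalue_uniform(samples: Sequence[float]) -> float:
    """Approximate KS p-value of the samples against the uniform [0, 1] CDF."""
    xs = sorted(samples)
    n = len(xs)
    d = max(max(x - i / n, (i + 1) / n - x) for i, x in enumerate(xs))
    # Small-sample correction applied to the KS distance (assumed here).
    lam = (math.sqrt(n) + 0.12 + 0.11 / math.sqrt(n)) * d
    if lam < 0.2:
        return 1.0  # the series below is numerically useless; p is ~1
    series = sum(
        (-1) ** (j - 1) * math.exp(-2.0 * j * j * lam * lam)
        for j in range(1, 101)
    )
    return min(1.0, max(0.0, 2.0 * series))


def accept(value: float, history: Sequence[float], p_th: float, window: int = 50) -> bool:
    """Accept the new value when the p-value of the recent window exceeds p_th."""
    recent = (list(history) + [value])[-window:]
    return ks_pvalue_uniform(recent) > p_th


if __name__ == "__main__":
    spread = [(i + 0.5) / 50 for i in range(50)]   # looks uniform
    print(accept(0.5, spread, p_th=0.05))           # accepted
    print(accept(0.99, [0.99] * 49, p_th=0.05))     # rejected
```

A player whose recent values are spread evenly over [0, 1] passes, while one who keeps declaring values close to 1 is rejected almost immediately.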

For practical reasons, we need to reduce the history of a user to a relatively small number of samples. Hence, we
propose a slight modification to the acceptance test of Algorithm 2 to make it implementable in real systems. With
this modification, each node applies the KS test using only a small number of the latest published values. However,
this makes the KS test prone to inaccurate estimations. For example, a selfish node could publish
values following a Beta(1, 0.9) distribution. With high probability, this situation could not be detected with
sample sequences of small length. In addition to choosing a large enough sample size (our simulations show
that 50 samples are enough), we play with the threshold to refine the test. The idea is to modify the acceptance
threshold so that it is hardened when the actual normalized utility of the player is higher than the theoretical
expectation, and it is relaxed when players are losing more than expected. There are many ways of implementing
this idea, but we propose the following expression



$$p\text{-}th = \frac{1}{\log(k + 1)^{\delta \left(1 - (\mu_k - \mu)\sqrt{k}\right)}},$$

where $\delta$ is a tuning parameter, $\mu$ is the expected normalized utility of all players and $\mu_k$ is the actual normalized
utility of the player at round $k$. To illustrate this idea we depict Fig. 2, which represents the value of this threshold
as a function of the total normalized utility of the player. Clearly, the above formula is entirely empirical, although
the simulations below in this paper show that it fits our requirements well. One of the reasons that has led to the
development of this proposal has been the idea that a new player must "pay" some kind of "fee" when she enters
the system. In this way, we want to avoid, or at least reduce, the problem of low-cost identities or cheap
pseudonyms. With our proposal, at the beginning QPQ assigns tasks almost randomly, while later, when we have
more information about players, QPQ assigns tasks optimally. Each player has to "pay" at the beginning by working
under random assignments and thus, she has no incentive to exit and re-enter the system.

The final implementable QPQ algorithm we propose may be written as presented in Algorithm 3.



Algorithm 3 Implementable Quid Pro Quo mechanism (code for node i, and a generic task t, omitted)

    Estimate the cost $c_i$ of the task
    Publish the normalized cost $\bar{c}_i = \mathrm{PIT}(c_i)$
    Wait to receive the normalized costs $\bar{c}_j$ from the other players
    for all $j \in N$ do
        Let $p\text{-}th_j = 1 / \log(k + 1)^{\delta \left(1 - (\mu_{j,k} - \mu)\sqrt{k}\right)}$
        if not KSTest($\bar{c}_j$, Historic$_j$, $p\text{-}th_j$) then
            $\bar{c}_j \leftarrow$ Random($\bar{c}_{-j}$)
        end if
        Historic$_j \leftarrow$ Historic$_j \cup \{\bar{c}_j\}$
    end for
    Let $d = \arg\min_j \bar{c}_j$
    if $d = i$ then
        execute the task
    else
        do nothing (node $d$ will execute the task)
    end if
    for all $j \in N$ do
        recompute $\mu_{j,k+1}$
    end for



5.1 Simulations 

By performing simulations, we have checked various aspects of the implementable QPQ. First of all, we wondered
whether the new GoF test may punish fair players by generating false negatives. To this end, Fig. 3 represents the
boxplot of the expectation of the normalized work done in 100 rounds when all players are honest and no GoF test
is applied. This picture serves as control and allows us to compare it with the same game but introducing the GoF
test of Algorithm 3, using a history of 50 samples for the KS test and $\delta = 2$. The results are depicted in Fig. 4. As
can be seen, the performance loss caused by false negatives is minimal and barely noticeable in these scenarios.

Figure 3: Two honest players with no control (100 rounds). [Two panels, each titled "Payoff Boxplot with 100 rounds".]

Figure 4: Two honest players with KS control (100 rounds). [Two panels, each titled "Payoff Boxplot with 100 rounds".]

The next question is to what extent selfish users can fool the algorithm and achieve improvements in their
utility. We have simulated dishonest behavior by using several distributions close to the uniform but with higher
mean, by taking advantage of the properties of the Beta function, so that these distributions try to pass the
implementable KS test and, at the same time, obtain some profit in the long run. Again, we have run simulations
considering a game with two nodes, one honest (uniform) and one dishonest, for a set of 1,000 rounds, with a
historical length of 50 samples for the implementable KS test and with $\delta = 2$. The results can be seen in Table 1,
which depicts the normalized player utilities for different scenarios. In the table, the name "Uniform" represents
honest nodes, "Random" is used for non-rational players generating random costs and, finally, "Beta" and "Normal"
are used for dishonest players following those distributions. As can be observed, honest utilities remain quite
constant, while non-rational and dishonest utilities decrease, although never under a given limit. Interestingly, note
that this behavior is maintained even in the extreme case of a Beta(1, 0.9) distribution. Observe that, when the
number of samples is small (around 50), a Beta(1, 0.9) is so similar to a uniform distribution that it is hardly
distinguishable by eye.

Finally, for the same simulation scenario, in Fig. 5 we compare the behavior of the implementable KS test of
Algorithm 3 for fair users (uniform) playing against a node with several manipulative profiles (Beta distribution
variants) as the number of rounds increases. As can be observed, the honest player quickly gets her values to pass
the test, while the dishonest one soon gets into trouble because her values are rejected, even with distributions very
similar to the uniform.



Distributions               U1      U2
Uniform vs. Uniform         0.332   0.332
Uniform vs. Random          0.331   0.250
Uniform vs. Beta(1, 0.9)    0.321   0.258
Uniform vs. Beta(1, 0.7)    0.315   0.264
Uniform vs. Normal          0.352   0.250

Table 1: Honest vs. dishonest utility/cost.

Figure 5: Percentage of rejected values.

6 Conclusion and future work 

Throughout this paper, we have presented QPQ, an algorithm for the optimal allocation and execution of tasks in a
distributed environment with selfish behavior. Unlike many preexisting works, this algorithm proposes a
mechanism that uses neither payments nor prior information on the behavior of the players. We have demonstrated
that the algorithm tolerates dishonest, non-rational, or rationally limited behaviors without punishing fair users,
while rewarding players in proportion to their degree of truthfulness.

The proposed algorithm may be adapted using reasonable approximations so that it can be implemented in 
real networks with affordable computational and communication complexity. For all these reasons, we claim that 
this algorithm opens new horizons for the creation of novel computing frameworks where users can openly and 
effectively cooperate to achieve a common goal, based on the collaborative execution of simple atomic independent 
tasks. 

Despite this, we consider it necessary to carry out further research to make QPQ robust to more sophisticated
selfishness scenarios. For example, it would be necessary to consider cases in which players are not independent
and form groups that try to break the system's fairness.

Another aspect that should be extended is the notion of task utility. We have assumed that all nodes
have an interest in having all tasks done. However, in a real environment, it is possible that only a subset of tasks
is relevant for a given node. Hence, further work is needed to relax some of the QPQ hypotheses and
deal with this aspect.

Finally, it would be worth investigating GoF tests other than KS, to analyze whether they
can provide advantages for real implementations of the algorithm (for instance, using just a small set of samples
to implement the acceptance test).